1.0 INTRODUCTION


In the search for improved detection performance in sonar signal processing, there
has been a trend toward the use of more complex processing methods. An interesting
example is matched field processing, in which the assumptions of plane wave
propagation are discarded in favor of more detailed models of ocean acoustics. The extra
detection performance of these methods is achieved at the expense of additional
computational effort. However, the increasing availability of parallel computers motivates
us to explore the application of these new machines to challenging problems of sonar
signal processing.

This report discusses work performed to implement matched-field processing on the
Thinking Machines Corporation’s Connection Machine (model CM-2). This was part of
a task with twofold objectives. One was to develop a high-performance computing
capability for the specific matched field processing application. The other was to advance
generic software technology, specifically to address the difficult issue of software
portability for parallel machines. In this report, the discussion will be focused primarily on
the former objective.
2.0 MATCHED FIELD PROCESSING


Many future undersea surveillance systems are likely to incorporate some form of
the signal processing technique known as matched field processing (MFP). The essence
of the method is depicted in figures 1 through 3. Output power indicates the degree of
match between measured sound pressure fields (from sensor data) and model
predictions (from replica data). The output power is to be computed for a multitude of
ranges, azimuths, depths, and frequencies. An important observation is that matched
field processing has, to varying degrees along the processing chain, high levels of
parallelism in the frequency, spatial location, and sensor dimensions. For example, FFTs
can be computed in parallel for all sensors; each FFT has further levels of exploitable
parallelism (i.e., individual butterfly computations).

There are a number of variants of matched field processing. In this task, it was
initially planned to implement four different forms of matched field processing,
referred to as subsampled MVDR, full MVDR, conventional MFP, and array
partitioning. The most general form of these four is array partitioning, which is the method
shown in figure 1. (Array partitioning is described in more detail in the appendix.) By
performing the quadratic forms part of the computation in different ways, either
Bartlett processing or minimum variance distortionless response (MVDR) processing
can be considered. MVDR is also known as the maximum likelihood method. These
two alternatives for the quadratic forms are discussed in [Baggeroer et al., 1988].
Subsampled MVDR and full MVDR are specializations in which the spatial filtering and
summation over subarray is bypassed. Subsampled MVDR and full MVDR are actually
the same algorithm with different implementation details on a moderately parallel
machine (subsampled MVDR would perform matrix algebra computations without
interprocessor communication; full MVDR would employ interprocessor communication; the
distinction between subsampled and full disappears on the Connection Machine). The
MVDR processing chain is shown in figure 2. Conventional MFP is the further
specialization in which Bartlett processing takes the place of the minimum variance
computations; that is, the subarray matrix factoring is bypassed. This is shown in figure 3.
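
The Bartlett quadratic form used by conventional MFP can be illustrated with a short sketch. It is not taken from the report: the function name, the unit-norm scaling of the replica, and the averaging over snapshots are assumptions made for illustration, using the standard Bartlett definition (output power as the averaged squared magnitude of the inner product between a replica vector and the narrowband sensor data).

```python
def bartlett_power(snapshots, replica):
    """Average of |w^H x|^2 over snapshots, with w the unit-norm replica.

    snapshots: list of narrowband data vectors, one complex value per sensor.
    replica:   model-predicted field at the sensors for one candidate location.
    Both names are hypothetical; the report does not define this interface.
    """
    norm = sum(abs(r) ** 2 for r in replica) ** 0.5
    w = [r / norm for r in replica]
    total = 0.0
    for x in snapshots:
        # Inner product w^H x: conjugate the replica weights.
        y = sum(wi.conjugate() * xi for wi, xi in zip(w, x))
        total += abs(y) ** 2
    return total / len(snapshots)


# Data that match the replica give high power; orthogonal data give none.
replica = [1 + 0j, 1j, -1 + 0j, -1j]
matched = [[2 * r for r in replica]]
mismatched = [[2 + 0j, 2 + 0j, 2 + 0j, 2 + 0j]]
p_match = bartlett_power(matched, replica)
p_mismatch = bartlett_power(mismatched, replica)
```

Evaluating such a function once per candidate location and frequency yields the kind of output power surface described above. Each evaluation is independent of the others, which is the source of the parallelism over spatial location and frequency.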

Each method has its own advantages and disadvantages. Conventional MFP is the
simplest and was implemented on the Connection Machine (apart from the
computation of the narrowband time series). MVDR is somewhat more complicated and
computationally expensive than conventional MFP, but yields better detection performance.
Array partitioning is the most complicated, but has the potential to yield much better
detection performance for a given level of computational effort.

Figure 1. Matched-field processing with array partitioning.
Figure 2. Minimum variance distortionless response processing.
Figure 3. Conventional matched-field processing.

3.0 TARGET HARDWARE


A Connection Machine contains thousands of bit-serial processors arranged in
groups of 32. Each group of 32 physical processors consists of two custom (CM)
chips, each containing 16 physical processors, together with a memory chip, a
floating-point accelerator chip, and a chip to interface the floating-point accelerator with the
memory. By means of a time-slicing technique, each physical processor can perform
virtual processing; in other words, the Connection Machine can be operated to appear
transparently to have a larger number of physical processors than it actually has. The
ratio of virtual processors to physical processors is referred to as the virtual processor
ratio (VPR). In general, it is advantageous to be able to use high VPR, since this leads
to more efficient processing. Processors can communicate with one another either
through the router, which allows any processor to communicate with any other
processor, or through the north-east-west-south (NEWS) grid, which permits communication
over an N-dimensional rectangular mesh. An important observation is that the
communications expense is highly dependent on whether the router or the NEWS grid is used,
and on whether the communications are intragroup or intrachip. This has important
implications for the way the data structures of the algorithm should be arranged over
the processors of the CM-2. The activities of the Connection Machine are coordinated
by a conventional sequential computer known as the front end.

Other noteworthy features of the Connection Machine are the data vault, a
disk-array-based mass storage device, and the framebuffer, a high-resolution graphics
display. Both of these facilities make use of the parallel processing features of the CM-2
to achieve data transfer at high rates. It is natural to exploit parallelism in the I/O as
well as in the numerical computations, and this was an important element of the work.

The near-term preferred target machine for this effort was the AT&T DSP3, a
moderately parallel multiple-instruction-stream multiple-data-stream (MIMD) machine
[Shively et al., 1989]. The immediate matched-field processing requirements were to
treat problems with tens to hundreds of sensors and up to tens of frequencies, which
appeared to be well suited to the 128 processors of the DSP3. Because the DSP3 is an
MIMD machine, it affords the opportunity to work on different parts of the processing
chain concurrently. Because the DSP3 was not available at the start of the effort, the
Connection Machine (CM-2) from Thinking Machines Corporation, a massively parallel
single-instruction-stream multiple-data-stream (SIMD) machine, was used initially. The
configurations of the CM-2 that were available for this task had 4096, 8192, and
16,384 processors. One of the benefits of using the CM-2 was that its very different
architecture and programming environment provided an expanded base of experience
useful for later addressing software portability issues. Detailed discussions of the
Connection Machine are found in [Hillis, 1985] and [cm2tccsum].
4.0 SOFTWARE PORTABILITY FOR PARALLEL PROCESSORS


The initial attempt at addressing the portability issue was to employ a conventional
approach of phased development to separate the requirements and high-level design
from implementation details. Functional descriptions of the matched field processing
algorithms were prepared and reviewed. Code was then written from these functional
descriptions. Intermingling of front-end data structures and code with parallel processor
data structures and code was kept to a minimum. The front-end data structures and
code were written in the C language, while the parallel processor data structures and
code were expressed in C*, an extension of C developed for the Connection Machine.
Similarly, processes dealing only with interprocessor communication were separated
from processes involving numerical computations. From the functional descriptions
were derived requirements specifications in DOD-STD-2167A format [mvdrsrs],
[apasrs] and pseudocode documents [mvdrpseu], [apapseu] to facilitate future
software development.

The  approach  described  above  has  severe  limitations.  A  key  difficulty  is  that  the 
“distance”  or  dissimilarity  between  the  code  and  a  relatively  machine-independent 
intermediate  representation  (e.g.,  pseudocode)  is  great.  Consequently,  the  effort  in 
translating  from  a  high-level  representation  to  code  is  substantial  and  this  effort  must 
still  be  expended  anew  with  each  new  machine. 
5.0 MAPPING ONTO THE CONNECTION MACHINE

The CM-2 source code for the conventional MFP appears in [apascl].

Matched-field processing (as well as similar signal-processing algorithms) consists
of a chain of processes with the outputs of one process serving as the inputs to the
next process in the chain. A massively parallel implementation of each process
involves the use of N-dimensional rectangular meshes over which the data are arranged
for the parallel computations, with different meshes (including different values of N)
being appropriate for the different processes of the algorithm and different stages
within a process. In the array partitioning algorithm, the processes are, in order:
distribute to PEs by sensor; apply windows and perform FFT; distribute AzEl vectors;
spatially filter frequency bin data; distribute and sum NB data by subarray (called
distribute NB data in the non-array-partitioning case); factor subarray matrix;
distribute replica vectors; compute output power and narrowband time series; and
collect from PEs by frequency band. These are discussed in the appendix. The subset
capability implemented on the Connection Machine consisted of conventional MFP only,
with no computation of the narrowband time series. The processes associated with this
subset capability are: distribute to PEs by sensor; apply windows and perform FFT;
distribute NB data; distribute replica vectors; compute output power; and collect
from PEs by frequency band.

The rectangular mesh associated with a particular process reflects the parallelism
inherent in that process. For example, in the apply windows and perform FFT
process, the data are naturally arranged over a two-dimensional mesh, with the dimensions
corresponding to sensor and time on input and sensor and frequency on output. It is
also important to note the dimensions with respect to which the computations are
totally decoupled or “embarrassingly parallel” (EP). For example, in the apply
windows and perform FFT process, all sensor channels can be treated completely
independently of one another, so we say the process is EP with respect to sensor. The
rectangular meshes are indicated in figure 4, with the EP mesh edges indicated in upper
case; the labeling applies at the conclusion of each process’ execution. By identifying
the dimensions over which the processing is EP, it is possible to decide how to arrange
data over the processors to keep the communications costs low. The parallelism of our
Connection Machine implementation of conventional MFP is shown in figure 5.
Real-world limitations such as finite memory prevent us from exploiting all the intrinsic
parallelism of an ideal algorithm.

It should be noted that the implementation of the software discussed in this report
does not exploit the “EP-ness” of the problem in this way because the software
uses the less efficient router communications only. However, it should not be too
difficult to rewrite the software to use the more efficient N-d grid package (from the
NRL C* library), which employs the NEWS grid to perform fast nearest-neighbor
communications. Some extensions to the N-d grid package would be needed to exploit
the fact that communications are low or nonexistent along certain mesh dimensions. It
would still be necessary to use router communications in some parts of the algorithm.

Figure 4. Parallelism in the array-partitioning processing chain.
Figure 5. Parallelism in the Connection Machine implementation of conventional MFP.

The  distribute  to  PEs  by  sensor  process  uses  a  sensor/time  2-D  mesh,  and  is  EP 
with  respect  to  sensor  and  time. 

The  apply  windows  and  perform  FFT  process  uses  a  sensor/time  2-D  mesh  (input) 
and  sensor/frequency  2-D  mesh  (output),  and  is  EP  with  respect  to  sensor. 

The distribute AzEl vectors process uses a sensor/frequency 2-D mesh (input) and
sensor/frequency-azimuth-elevation 2-D mesh (output), and is EP with respect to sensor
and frequency-azimuth-elevation. This was not a part of our Connection Machine
implementation.

The  spatially  filter  frequency  bin  data  process  uses  a  sensor/frequency- 
azimuth-elevation  2-D  mesh,  and  is  EP  with  respect  to  sensor  and  frequency- 
azimuth-elevation.  This  was  not  a  part  of  our  Connection  Machine  implementation. 

The  distribute  and  sum  NB  data  by  subarray  process  uses  a  sensor/frequency- 
azimuth-elevation  2-D  mesh  (input)  and  a  subarray/frequency-azimuth-elevation/time 
epoch  group  3-D  mesh  (output),  and  is  EP  with  respect  to  subarray  and  frequency- 
azimuth-elevation.  Our  Connection  Machine  implementation  (of  distribute  NB  data)  is 
EP  with  respect  to  frequency  only. 

The  factor  subarray  matrix  process  uses  a  subarray/frequency-azimuth-elevation/ 
time epoch group 3-D mesh, and is EP with respect to frequency-azimuth-elevation.
This  was  not  a  part  of  our  Connection  Machine  implementation. 

The  distribute  replica  vectors  process  uses  a  subarray/frequency-azimuth-elevation/ 
spatial  location  3-D  mesh,  and  is  EP  with  respect  to  subarray,  spatial  location,  and 
frequency-azimuth-elevation.  Our  Connection  Machine  implementation  is  EP  with 
respect  to  sensor  and  frequency  only. 

The compute output power and narrowband time series process uses a subarray/
(spatial location or time epoch) column group/frequency-azimuth-elevation 3-D mesh
(input) and a spatial location/frequency-azimuth-elevation 2-D mesh (output), and is EP
with respect to spatial location and frequency-azimuth-elevation. Our Connection
Machine implementation was EP with respect to frequency only.

The  collect  from  PEs  by  frequency  band  process  uses  a  spatial  location/frequency- 
azimuth-elevation  2-D  mesh,  and  is  EP  with  respect  to  spatial  location  and  frequency- 
azimuth-elevation. 

Note  that  downstream  of  the  apply  windows  and  perform  FFT  process,  the  entire 
processing  (sub)chain  is  EP  with  respect  to  frequency-azimuth-elevation. 


6.0  VISUALIZATION  FOR  MATCHED-FIELD  PROCESSING 


The process collect from PEs by frequency band produces a large volume of
output data, indexed by spatial location (range and depth), frequency, and time epoch.
Because matched-field processing is a relatively unexplored area of investigation, it is
worthwhile to be able to present the output data to an analyst with little data reduction
so as to foster the insights needed for subsequent, more structured statistical analyses.
For example, prior to attempting an empirical probability of detection analysis, it is
necessary to have a reasonably good a priori knowledge of a target’s location in range
and depth, a task that is made difficult by the ambiguities introduced by the
complicated propagation of sound in the ocean.

An  approach  to  presenting  the  kind  of  multidimensional  data  set  used  in  this  task 
was  to  employ  the  Connection  Machine’s  framebuffer  to  rapidly  play  back  outputs 
stored  on  the  data  vault  for  many  time  epochs  as  an  animation  or  “movie.”  Such  a 
movie  consists  of  a  series  of  frames  appearing  on  the  display  in  rapid  succession.  Each 
frame  consists  of  a  collection  of  B-scan  displays,  each  one  corresponding  to  a  different 
frequency.  Each  B-scan  display  indicates  output  power  as  gray  level  (as  a  function  of 
range  and  depth).  This  is  illustrated  in  figure  6. 
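
The frame format described above can be sketched as a tiling of per-frequency B-scans into one gray-level image. The code below is illustrative only: the function name, the left-to-right tile order, the number of tile columns, and the linear scaling to 256 gray levels are assumptions and are not taken from the report or from figure 6.

```python
def build_frame(power, n_columns, levels=256):
    """Tile B-scans into one frame.

    power[f][d][r] is the output power for frequency f, depth d, range r
    at one time epoch. Returns a 2-D list of integer gray levels.
    """
    n_freq = len(power)
    n_depth = len(power[0])
    n_range = len(power[0][0])
    n_rows = -(-n_freq // n_columns)  # ceiling division
    # Scale all tiles by the frame's peak power (1.0 avoids dividing by zero).
    peak = max(max(max(row) for row in scan) for scan in power) or 1.0
    frame = [[0] * (n_columns * n_range) for _ in range(n_rows * n_depth)]
    for f, scan in enumerate(power):
        top = (f // n_columns) * n_depth
        left = (f % n_columns) * n_range
        for d in range(n_depth):
            for r in range(n_range):
                frame[top + d][left + r] = int((levels - 1) * scan[d][r] / peak)
    return frame


# Three frequencies, each a 2 x 2 (depth x range) B-scan, two tiles per row.
power = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[4.0, 4.0], [4.0, 4.0]],
    [[8.0, 0.0], [0.0, 0.0]],
]
frame = build_frame(power, n_columns=2)
```

A movie is then a sequence of such frames, one per time epoch, played back in rapid succession.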

Figure 6. Frame format for visualization of matched-field processor output.


7.0  PRELIMINARY  PERFORMANCE  EVALUATION 


A rudimentary evaluation of the implementation of conventional MFP on the
Connection Machine was done to gauge the performance, at least in order-of-magnitude
terms. The parameters of the test case were as follows: 4096 spatial locations, 32
sensors, 8 retained frequency bins, and one epoch comprising 256 temporal points per
FFT window. This test case was evaluated by using a CM-2 with 8192 physical
processors. The elapsed time for this processing was approximately 8 minutes. Roughly three
quarters of this time was consumed in the output power computation, with most of the
remainder arising from I/O. Subsequent analysis suggested that this extremely poor
performance resulted from the heavy use of router communication.
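
As a rough check on the scale of this test case, the arithmetic in the output power stage can be counted directly. The count below rests on an assumption that is not stated in the report: one inner product, of length equal to the number of sensors, for each spatial location and retained frequency bin, applied to the single FFT snapshot of the epoch.

```python
# Test-case parameters quoted in the text.
n_locations = 4096
n_sensors = 32
n_bins = 8

# Assumed cost model: one replica/data inner product per (location, bin) pair.
inner_products = n_locations * n_bins
complex_macs = inner_products * n_sensors  # complex multiply-adds
```

Under that assumption the stage needs about one million complex multiply-adds. That is a small amount of arithmetic to set against an elapsed time of minutes, which is consistent with attributing the run time to router communication rather than to computation.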


8.0  CONCLUSIONS  AND  RECOMMENDATIONS 


The  effort  described  in  this  report  pointed  up  a  number  of  opportunities  and 
difficulties  associated  with  implementing  matched-field  processing  and  similar  types  of 
sonar  signal  processing  on  massively  parallel  computers. 

Conventional  MFP  was  implemented  on  the  Connection  Machine.  In  this  initial 
implementation,  the  potential  of  the  CM-2  was  not  realized  because  the  programming 
style  and  language  features  used  led  to  a  large  interprocessor  communications  burden. 

In  subsequent  efforts  at  developing  signal  processing  on  parallel  processors,  there 
should  be  additional  emphasis  on  decomposing  the  overall  processing  into  a  relatively 
small  set  of  building  blocks  that  are  of  higher  level  than  elementary  arithmetic 
operations  on  scalars.  Broad  categories  of  these  low-level  building  blocks  would  include 
(i)  matrix  operations  such  as  those  of  the  basic  linear  algebra  subprograms;  (ii)  the 
fast  Fourier  transform;  (iii)  data  motion  primitives  to  support  such  non-numeric 
operations  as  buffering  with  overlap,  transpose,  gather/scatter,  and  others. 

One  of  the  issues  that  complicates  the  development  of  portable  parallel  libraries  is 
deciding  on  appropriate  arrangements  of  data  structures  over  distributed  memory.  For 
conventional  machines,  such  matters  as  row  or  column  ordering  and  strides  are  of 
concern.  For  parallel  computers,  the  characteristics  of  the  machine  play  a  more 
substantial  role  and  introduce  a  larger  range  of  choices  that  must  be  made. 

It  is  encouraging  to  observe  that  the  array  partitioning  version  of  matched-field 
processing  has  a  high  degree  of  exploitable  parallelism,  with  the  bulk  of  the  algorithm 
embarrassingly  parallel  with  respect  to  frequency-azimuth-elevation.  It  is  also  worth 
noting  that  the  processing  of  data  from  external  sources  is  not  the  only  situation  in 
which  massively  parallel  machines  and  algorithms  are  relevant.  When  detailed 
simulation  studies  are  to  be  performed,  there  is  an  additional  problem  dimension 
introduced,  namely  the  realizations  of  the  pseudorandom  sequences  used  to  generate 
databases  of  output  statistics.  In  this  case,  we  have  parallelism  with  respect  to 
frequency-azimuth-elevation  realization.  The  applicability  of  massively  parallel 
computers  to  these  simulation  studies  should  be  explored. 

Some  initial  explorations  were  made  in  visualizing  matched  field  processing  for  the 
case  of  no  azimuthal  resolution.  Introducing  the  additional  dimension  of  azimuth  will 
provide  new  challenges. 
APPENDIX


A.1.0 Partitioned Array Bartlett Program

This is a functional description for a program which forms Bartlett
azimuth and elevation beams for each subarray of a partitioned array, then
applies Bartlett or Minimum Variance Distortionless Response (MVDR) Matched
Field Processing (MFP) to combine the subarray outputs.

A.1.1 Partitioned Array Bartlett Program Inputs

Raw_Sensor_Data:
    TBD

Raw_Replica_Vectors:
    TBD

Parameters:
    N_Points_Per_Update
    N_Sensors
    N_Time_Max
    N_FFT_Size
    I_Freq_Bin_First
    I_Freq_Bin_Last
    N_Freq_Bins_Out
    N_Saved_Updates
    N_Freq_Bands
    N_Freq_Bins_Per_Band
    N_Subarrays
    I_First_Sensor[i]    i = 0, ..., N_Subarrays - 1
    I_Last_Sensor[i]     i = 0, ..., N_Subarrays - 1
    N_AzEl_Beams
    N_AzEl_Beams_Per_Batch
    N_AzEl_Batches
    N_Retained_Times
    N_Replicas
    N_Replicas_Per_Batch
    N_Replica_Batches
    I_B_MF_Flag

QR_Parameters:
    Inverse_Condition_Number_Threshold

Constraints:
    N_Freq_Bins_Out = I_Freq_Bin_Last - I_Freq_Bin_First + 1
    N_FFT_Size = N_Saved_Updates * N_Points_Per_Update
    N_Freq_Bins_Out = N_Freq_Bins_Per_Band * N_Freq_Bands
    N_AzEl_Beams = N_AzEl_Beams_Per_Batch * N_AzEl_Batches
    N_Replicas = N_Replicas_Per_Batch * N_Replica_Batches
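
The constraints listed above lend themselves to a simple consistency check at program start. The following is a minimal sketch, assuming the parameters are held in a dictionary keyed by the names used in this appendix. The function name and the example values are hypothetical.

```python
def check_constraints(p):
    """Return the constraints that the parameter dictionary p violates."""
    checks = {
        "N_Freq_Bins_Out = I_Freq_Bin_Last - I_Freq_Bin_First + 1":
            p["N_Freq_Bins_Out"]
            == p["I_Freq_Bin_Last"] - p["I_Freq_Bin_First"] + 1,
        "N_FFT_Size = N_Saved_Updates * N_Points_Per_Update":
            p["N_FFT_Size"] == p["N_Saved_Updates"] * p["N_Points_Per_Update"],
        "N_Freq_Bins_Out = N_Freq_Bins_Per_Band * N_Freq_Bands":
            p["N_Freq_Bins_Out"]
            == p["N_Freq_Bins_Per_Band"] * p["N_Freq_Bands"],
        "N_AzEl_Beams = N_AzEl_Beams_Per_Batch * N_AzEl_Batches":
            p["N_AzEl_Beams"]
            == p["N_AzEl_Beams_Per_Batch"] * p["N_AzEl_Batches"],
        "N_Replicas = N_Replicas_Per_Batch * N_Replica_Batches":
            p["N_Replicas"]
            == p["N_Replicas_Per_Batch"] * p["N_Replica_Batches"],
    }
    return [name for name, ok in checks.items() if not ok]


# Illustrative values only; they are not taken from the report.
params = {
    "N_Points_Per_Update": 64, "N_Saved_Updates": 4, "N_FFT_Size": 256,
    "I_Freq_Bin_First": 10, "I_Freq_Bin_Last": 17, "N_Freq_Bins_Out": 8,
    "N_Freq_Bands": 2, "N_Freq_Bins_Per_Band": 4,
    "N_AzEl_Beams": 6, "N_AzEl_Beams_Per_Batch": 3, "N_AzEl_Batches": 2,
    "N_Replicas": 4096, "N_Replicas_Per_Batch": 512, "N_Replica_Batches": 8,
}
violations = check_constraints(params)
bad_violations = check_constraints(dict(params, N_FFT_Size=128))
```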


A.1.2 Partitioned Array Bartlett Program Input/Outputs

none

A.1.3 Partitioned Array Bartlett Program Outputs

Output_Power:
    TBD

Narrowband_Time_Series:
    TBD


A.1.4 Partitioned Array Bartlett Program Algorithm

Read Parameters
Do one-time calculations
Open input and output data files
While more sensor data to read
    Invoke Distribute_to_PEs_by_Sensor process
    Invoke Apply_Windows_and_Perform_FFT process
    While more AzEl batches to read
        Invoke Distribute_AzEl_Vectors process
        Invoke Spatially_Filter_Frequency_Bin_Data process
        Invoke Distribute_and_Sum_NB_Data_by_Subarray process
    End while
    Invoke Factor_Subarray_Matrix process
    While more replicas to read
        Invoke Distribute_Replica_Vectors process
        Invoke Compute_Output_Power_and_Narrowband_Time_Series process
        Invoke Collect_from_PEs_by_Frequency_Band process
    End while
End while
Close input and output data files


A.1.5 Partitioned Array Bartlett Program Special Requirements

TBD

A.1.6 Partitioned Array Bartlett Program Validation Criteria

The following tests shall be employed to validate the program:

(i) Simulated acoustic fields arising from two plane waves, together with
additive white Gaussian noise, independent and identically distributed from
sensor to sensor, shall be generated and supplied as Raw_Sensor_Data. The
Output_Power and Narrowband_Time_Series shall be examined for
agreement with theoretical predictions. In particular, maximum response should
result from those replicas corresponding to the true arrival directions of the
plane waves.

(ii) Seatest data shall be processed and the outputs compared with those
produced by existing processing software.

A.2.0 Distribute to PEs by Sensor Process

The Distribute to PEs by Sensor Process accesses from mass storage
real time series indexed by time and sensor, reorganizes it if necessary, and
routes it to PEs. The output data is organized in time updates, one sensor
per PE.

A.2.1 Distribute to PEs by Sensor Process Inputs

Raw_Sensor_Data:
    TBD

Parameters:
    N_Points_Per_Update
    N_Sensors
    N_Time_Max

Time_Index:
    I_Time

A.2.2 Distribute to PEs by Sensor Process Input/Outputs

Sensor_History:
    xh[i, j, k]    i = 0, ..., N_Points_Per_Update - 1
                   j = 0, ..., N_Sensors - 1
                   k = 0, ..., N_Saved_Updates - 1
    xh real
    K_Oldest_Update


A.2.3 Distribute to PEs by Sensor Process Outputs

none 

A.2.4 Distribute to PEs by Sensor Process Algorithm

For each j in 0, ..., N_Sensors - 1
    Fill xh[i, j, K_Oldest_Update]
End for
K_Oldest_Update = ( K_Oldest_Update + 1 ) mod N_Saved_Updates
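
The update rule above describes a circular buffer: each new update overwrites the oldest slot, and the oldest-slot index then advances modulo N_Saved_Updates. A minimal sketch follows, assuming nested lists in place of the per-PE storage of the actual implementation. The class and method names are hypothetical.

```python
class SensorHistory:
    """Circular buffer holding the N_Saved_Updates most recent updates."""

    def __init__(self, n_points_per_update, n_sensors, n_saved_updates):
        self.n_points = n_points_per_update
        self.n_sensors = n_sensors
        self.n_saved = n_saved_updates
        # xh[i][j][k]: sample i of update slot k for sensor j.
        self.xh = [[[0.0] * n_saved_updates for _ in range(n_sensors)]
                   for _ in range(n_points_per_update)]
        self.k_oldest_update = 0

    def push_update(self, update):
        """Store a new update, where update[i][j] is sample i for sensor j."""
        k = self.k_oldest_update
        for j in range(self.n_sensors):
            for i in range(self.n_points):
                self.xh[i][j][k] = update[i][j]
        # The slot after the one just filled now holds the oldest data.
        self.k_oldest_update = (k + 1) % self.n_saved


# Push four updates into a buffer with three slots; update 0 is overwritten.
history = SensorHistory(n_points_per_update=2, n_sensors=1, n_saved_updates=3)
for u in range(4):
    history.push_update([[float(u)], [u + 0.5]])
```

In this sketch the loop over sensors runs sequentially. On the Connection Machine each sensor's data resides on its own PE, so the fill for all sensors would proceed in parallel.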


A.2.5 Distribute to PEs by Sensor Process Special Requirements

The Sensor_History xh[] shall be 16-bit real.

A.2.6 Distribute to PEs by Sensor Process Validation Criteria

The following test shall be employed to validate the process:

(i) The time index and sensor index are to be encoded into each data
value of Raw_Sensor_Data. The Sensor_History values xh[i, j, k] shall then be
examined for agreement with (i, j).

A.3.0 Apply Windows and Perform FFT Process

The  Apply  Windows  and  Perform  FFT  Process  transforms  blocks  of  time 
series  to  the  frequency  domain.  A  circular  buffer  of  input  data  is  maintained. 

A.3.1 Apply Windows and Perform FFT Process Inputs

Spectral_Analysis_Window:
    w[i]           i = 0, ..., N_FFT_Size - 1
    w real

Sensor_History:
    xh[i, j, k]    i = 0, ..., N_Points_Per_Update - 1
                   j = 0, ..., N_Sensors - 1
                   k = 0, ..., N_Saved_Updates - 1
    xh real
    K_Oldest_Update

Parameters:
    N_Points_Per_Update
    N_Sensors
    N_FFT_Size
    N_Saved_Updates
    N_Time_Max
    I_Freq_Bin_First
    I_Freq_Bin_Last
    N_Freq_Bins_Out

Time_Index:
    I_Time


A.3.2 Apply Windows and Perform FFT Process Input/Outputs

none 

A.3.3 Apply Windows and Perform FFT Process Outputs

Raw_Frequency_Bin_Data:
    yr[i, j]       i = 0, ..., N_FFT_Size - 1
                   j = 0, ..., N_Sensors - 1
    yr complex

A.3.4 Apply Windows and Perform FFT Process Algorithm

Define C(L) = L mod N_Points_Per_Update
Define D(L) = ( K_Oldest_Update + L / N_Points_Per_Update ) mod N_Saved_Updates
For each j in 0, ..., N_Sensors - 1
    xw[i, j] = w[i] xh[C(i), j, D(i)],
        i = 0, ..., N_FFT_Size - 1
    yr[i, j] = FFT(i; N_FFT_Size; xw[., j]),
        i = 0, ..., N_FFT_Size - 1
End for
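
The indexing in this algorithm can be made concrete with a short sketch. It assumes that the division in D(L) is integer division, that the transform uses the forward sign convention exp(-2*pi*i*...), and that nested lists stand in for the distributed arrays. None of these details is stated in the functional description. A direct O(N^2) DFT stands in for the FFT so that the sketch stays self-contained, and the function names are hypothetical.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform (stand-in for an FFT routine)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * i * m / n) for m in range(n))
            for i in range(n)]


def apply_windows_and_fft(xh, w, k_oldest_update,
                          n_points_per_update, n_saved_updates):
    """Return yr[i][j]: frequency bin i of sensor j.

    xh[c][j][d] is the circular Sensor_History buffer and w is the window.
    """
    n_fft = n_points_per_update * n_saved_updates
    n_sensors = len(xh[0])
    spectra = []
    for j in range(n_sensors):  # embarrassingly parallel over sensors
        xw = []
        for i in range(n_fft):
            c = i % n_points_per_update                        # C(i)
            d = (k_oldest_update
                 + i // n_points_per_update) % n_saved_updates  # D(i)
            xw.append(complex(w[i] * xh[c][j][d]))
        spectra.append(dft(xw))
    # Transpose from [sensor][bin] to the yr[i][j] layout used in the text.
    return [[spectra[j][i] for j in range(n_sensors)] for i in range(n_fft)]


# One sensor, two updates of two points each. Slot 1 is the oldest update
# (samples 1, 2) and slot 0 the newest (samples 3, 4).
xh = [[[3.0, 1.0]], [[4.0, 2.0]]]
w = [1.0, 1.0, 1.0, 1.0]  # rectangular window, for illustration only
yr = apply_windows_and_fft(xh, w, k_oldest_update=1,
                           n_points_per_update=2, n_saved_updates=2)
```

The functions C and D unwrap the circular buffer so that the window is applied to the samples in time order, oldest first. In the example the reconstructed series is 1, 2, 3, 4.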


A.3.5 Apply Windows and Perform FFT Process Special Requirements

The Sensor_History xh[] shall be 16-bit real.

The xw arrays shall be complex so that a complex-to-complex FFT may
be used.

A.3.6 Apply Windows and Perform FFT Process Validation Criteria

Tests for validating this process are described in the document "Preliminary
Requirements Specification: Function Validation".

A.4.0 Distribute AzEl Vectors Process

The Distribute AzEl Vectors Process routes the steering vectors for
Azimuth-Elevation beams so that each PE has the vectors for all frequency
bins to be processed, and for the channels which it FFTed.

A.4.1 Distribute AzEl Vectors Inputs

Raw_Azimuth_Elevation_Vectors:
    TBD

Parameters:
    N_Freq_Bands
    N_Freq_Bins_Per_Band
    N_AzEl_Beams_Per_Batch
    N_Sensors


A.4.2 Distribute AzEl Vectors Input/Outputs

none

A.4.3 Distribute AzEl Vectors Outputs

AzEl_Vectors:
    va[i1, i2, k2, j]    i1 = 0, ..., N_Freq_Bands - 1
                         i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                         k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                         j = 0, ..., N_Sensors - 1
    va complex


A.4.4 Distribute AzEl Vectors Process Algorithm

TBD

A.4.5 Distribute AzEl Vectors Special Requirements

None

A.4.6 Distribute AzEl Vectors Validation Criteria

The following test shall be employed to validate the process:

The sensor number, frequency bin, and AzEl vector number shall be
encoded in the Raw_AzEl_Vectors. The AzEl_Vectors shall be examined for
agreement with [i1, i2, k2, j].

A.5.0 Spatially Filter Frequency Bin Data Process

The Spatially Filter Frequency Bin Data Process applies the weights of
each AzEl vector for each frequency bin to each sensor.

A.5.1 Spatially Filter Frequency Bin Data Process Inputs

Raw_Frequency_Bin_Data:
    yr[i, j]             i = 0, ..., N_FFT_Size - 1
                         j = 0, ..., N_Sensors - 1
    yr complex

AzEl_Vectors:
    va[i1, i2, k2, j]    i1 = 0, ..., N_Freq_Bands - 1
                         i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                         k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                         j = 0, ..., N_Sensors - 1
    va complex


A.5.2 Spatially Filter Frequency Bin Data Process Input/Outputs

none

A.5.3 Spatially Filter Frequency Bin Data Process Outputs

Raw_Filtered_NB_Data:
    fr[i1, i2, k1, k2, j]    i1 = 0, ..., N_Freq_Bands - 1
                             i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                             k1 = 0, ..., N_AzEl_Batches - 1
                             k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                             j = 0, ..., N_Sensors - 1
    fr complex


A.5.4 Spatially Filter Frequency Bin Data Process Algorithm


For each i1 in 0, ..., N_Freq_Bands - 1
  For each i2 in 0, ..., N_Freq_Bins_Per_Band - 1
    i = I_Freq_Bin_First + i1*N_Freq_Bins_Per_Band + i2
    For each k2 in 0, ..., N_AzEl_Beams_Per_Batch - 1
      For each j in 0, ..., N_Sensors - 1
        fr[i1, i2, k1, k2, j] = va[i1, i2, k2, j]*yr[i, j]
      End for
    End for
  End for
End for
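
The loop above can be sketched in Python as follows. This is only an illustrative sketch, not the Connection Machine implementation: the function name spatially_filter, the nested-list layout of va and yr, and the dictionary keyed by (i1, i2, k1, k2, j) are assumptions made here for readability.

```python
def spatially_filter(yr, va, k1, i_freq_bin_first, fr=None):
    """Weight raw frequency-bin data yr[i][j] by AzEl vectors va[i1][i2][k2][j].

    k1 is the index of the AzEl batch currently being processed; it is
    supplied by the caller (the enclosing batch loop), as in the spec.
    """
    n_bands = len(va)
    n_bins_per_band = len(va[0])
    n_beams_per_batch = len(va[0][0])
    n_sensors = len(va[0][0][0])
    if fr is None:
        fr = {}  # hypothetical storage: dict keyed by (i1, i2, k1, k2, j)
    for i1 in range(n_bands):
        for i2 in range(n_bins_per_band):
            # absolute FFT bin for band i1, bin-within-band i2
            i = i_freq_bin_first + i1 * n_bins_per_band + i2
            for k2 in range(n_beams_per_batch):
                for j in range(n_sensors):
                    fr[(i1, i2, k1, k2, j)] = va[i1][i2][k2][j] * yr[i][j]
    return fr
```

Calling the function once per AzEl batch with the same fr dictionary accumulates all batches in one structure.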


A.5.5 Spatially Filter Frequency Bin Data Process Special Requirements

The index k1 associated with the current AzEl batch is under control of
the loop "While more AzEl batches to read".

A.5.6 Spatially Filter Frequency Bin Data Process Validation Criteria

The following test shall be employed to validate the process:

Raw_Frequency_Bin_Data and AzEl_Vectors shall be synthesized such
that the real part of the Raw_Filtered_NB_Data will be equal to the sensor
number and the imaginary part will be encoded with the frequency bin
number and the vector number. The process will be run and the output
examined for correctness.
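
One possible way to synthesize such test inputs is sketched below. The report does not specify the encoding, so the scheme used here (unit sensor data, and a weight whose imaginary part is bin*N_AzEl_Beams_Per_Batch + k2) is purely an assumption, as are the function names synthesize_inputs and decode.

```python
def synthesize_inputs(n_bands, n_bins, n_beams, n_sensors, i_freq_bin_first):
    """Build yr and va so that va*yr has real part = sensor number and
    imaginary part = (absolute bin) * n_beams + k2 (assumed encoding)."""
    n_fft = i_freq_bin_first + n_bands * n_bins
    # unit sensor data, so the product simply reproduces the weight
    yr = [[1 + 0j for _ in range(n_sensors)] for _ in range(n_fft)]
    va = [[[[complex(j, (i_freq_bin_first + i1 * n_bins + i2) * n_beams + k2)
             for j in range(n_sensors)]
            for k2 in range(n_beams)]
           for i2 in range(n_bins)]
          for i1 in range(n_bands)]
    return yr, va


def decode(value, n_beams):
    """Recover (sensor, absolute bin, beam) from one filtered output value."""
    code = int(round(value.imag))
    return int(round(value.real)), code // n_beams, code % n_beams
```

After running the process on these inputs, each output value can be decoded and compared with the indices at which it was stored.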
A.6.0 Distribute and Sum NB Data by Subarray Process
The Distribute and Sum NB Data by Subarray Process routes frequency
bin data so that each PE has weighted data for selected AzEl beams for
all sensors for selected frequency bins. The PEs then sum the weighted
sensor data to form a narrowband time series for each AzEl beam for
each subarray.

A.6.1 Distribute and Sum NB Data by Subarray Process Inputs


Raw_Filtered_NB_Data:

fr[i1, i2, k1, k2, j]   i1 = 0, ..., N_Freq_Bands - 1
                        i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                        k1 = 0, ..., N_AzEl_Batches - 1
                        k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                        j = 0, ..., N_Sensors - 1
fr complex


Parameters:

N_Freq_Bands
N_Freq_Bins_Per_Band (= N_Freq_Bins_Out / N_Freq_Bands)
N_AzEl_Beams
N_AzEl_Batches
N_AzEl_Beams_Per_Batch
N_Subarrays
I_First_Sensor[s]   s = 0, ..., N_Subarrays - 1
I_Last_Sensor[s]    s = 0, ..., N_Subarrays - 1

Time Index:

I_Time


A.6.2 Distribute and Sum NB Data by Subarray Process Input/Outputs

none

A.6.3 Distribute and Sum NB Data by Subarray Process Outputs


Subarray_AzEl_NB_Time_Series:

yh[i1, i2, k1, k2, s, n]   i1 = 0, ..., N_Freq_Bands - 1
                           i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                           k1 = 0, ..., N_AzEl_Batches - 1
                           k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                           s = 0, ..., N_Subarrays - 1
                           n = 0, ..., N_Retained_Times - 1


A.6.4 Distribute and Sum NB Data by Subarray Process Algorithm


Define C(m) = m mod N_Retained_Times
For each i1 in 0, ..., N_Freq_Bands - 1
  For each i2 in 0, ..., N_Freq_Bins_Per_Band - 1
    For each k1 in 0, ..., N_AzEl_Batches - 1
      For each k2 in 0, ..., N_AzEl_Beams_Per_Batch - 1
        For each s in 0, ..., N_Subarrays - 1
          yh[i1, i2, k1, k2, s, C(I_Time)] = 0
          For each j in I_First_Sensor[s], ..., I_Last_Sensor[s]
            yh[i1, i2, k1, k2, s, C(I_Time)] =
              yh[i1, i2, k1, k2, s, C(I_Time)] + fr[i1, i2, k1, k2, j]
          End for
        End for
      End for
    End for
  End for
End for
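
A minimal Python sketch of this summation is given below. It assumes, for illustration only, that fr and yh are dictionaries keyed by index tuples and that the sensor range is inclusive of I_Last_Sensor[s]; the function name sum_by_subarray is hypothetical. The time slot C(I_Time) makes yh behave as a circular buffer holding the most recent N_Retained_Times snapshots.

```python
def sum_by_subarray(fr, yh, i_time, n_retained_times,
                    i_first_sensor, i_last_sensor):
    """Sum weighted sensor data fr over each subarray into yh.

    fr is keyed by (i1, i2, k1, k2, j); yh by (i1, i2, k1, k2, s, n).
    """
    slot = i_time % n_retained_times  # C(I_Time): circular time index
    beam_keys = {key[:4] for key in fr}  # distinct (i1, i2, k1, k2)
    for (i1, i2, k1, k2) in beam_keys:
        for s in range(len(i_first_sensor)):
            total = 0j
            # sensors of subarray s, last index included (assumed inclusive)
            for j in range(i_first_sensor[s], i_last_sensor[s] + 1):
                total += fr[(i1, i2, k1, k2, j)]
            yh[(i1, i2, k1, k2, s, slot)] = total  # overwrites oldest slot
    return yh
```

Because the slot is overwritten rather than accumulated, calling the function again N_Retained_Times steps later replaces the oldest snapshot.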


A.6.5 Distribute and Sum NB Data by Subarray Process Special Requirements

none

A.6.6 Distribute and Sum NB Data by Subarray Process Validation Criteria

The following test shall be employed to validate the process:

(i) The frequency index, sensor index, and AzEl beam index shall be
encoded into each data value of Raw_Filtered_NB_Data. The Subarray_AzEl_NB_Time_Series
values yh[i1, i2, k1, k2, s, n] shall then be examined for correctness.

A.7.0 Factor Subarray Matrix Process
The Factor Subarray Matrix Process updates X and performs
its QR factorization, where A = XX* is the cross-spectral matrix.

A.7.1 Factor Subarray Matrix Process Inputs


Subarray_AzEl_NB_Time_Series:

yh[i1, i2, k1, k2, s, n]   i1 = 0, ..., N_Freq_Bands - 1
                           i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                           k1 = 0, ..., N_AzEl_Batches - 1
                           k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                           s = 0, ..., N_Subarrays - 1
                           n = 0, ..., N_Retained_Times - 1
yh complex


Parameters:

N_Sensors
N_Freq_Bands
N_Freq_Bins_Per_Band (= N_Freq_Bins / N_Freq_Bands)
N_Retained_Times
N_Time_Max
N_Subarrays
N_AzEl_Beams
N_AzEl_Beams_Per_Batch
N_AzEl_Batches

QR_Parameters:

Inverse_Condition_Number_Threshold

Time Index:

I_Time


A.7.2 Factor Subarray Matrix Process Input/Outputs

none

A.7.3 Factor Subarray Matrix Process Outputs


Data_Matrix_Factorization:

to[i1, i2, k1, k2, m, n]   i1 = 0, ..., N_Freq_Bands - 1
                           i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                           k1 = 0, ..., N_AzEl_Batches - 1
                           k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                           m = 0, ..., N_Subarrays - 1
                           n = 0, ..., N_Subarrays - 1
to complex


A.7.4 Factor Subarray Matrix Process Algorithm


Define C(L) = L mod N_Retained_Times

Define X to be a matrix (N_Subarrays by N_Retained_Times) such that

X[m, n] = yh[i1, i2, k1, k2, m, n]   m = 0, ..., N_Subarrays - 1
                                     n = 0, ..., N_Retained_Times - 1
(one such X for each value of (i1, i2, k1, k2))

Define T to be an upper triangular matrix (N_Subarrays by N_Subarrays)
such that

T[m, n] = to[i1, i2, k1, k2, m, n]   m = 0, ..., N_Subarrays - 1
                                     n = 0, ..., N_Subarrays - 1
(one such T for each value of (i1, i2, k1, k2))

If I_B_ME_Flag = 0 then return
If I_Time < N_Retained_Times then return
For each i1 in 0, ..., N_Freq_Bands - 1
  For each i2 in 0, ..., N_Freq_Bins_Per_Band - 1
    For each k1 in 0, ..., N_AzEl_Batches - 1
      For each k2 in 0, ..., N_AzEl_Beams_Per_Batch - 1
        Matrix computation: T = QR_Factorization(QR_Parameters; X)
      End for
    End for
  End for
End for


A.7.5 Factor Subarray Matrix Process Special Requirements

The inverse condition number shall be monitored; if it falls below
Inverse_Condition_Number_Threshold, a diagnostic message shall be produced.
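
The report does not say how the inverse condition number is to be computed. As one inexpensive possibility, the sketch below uses the ratio of the smallest to the largest diagonal magnitude of the triangular factor T. For a triangular matrix this ratio can only overestimate the true 2-norm inverse condition number, so it is a screening proxy rather than an exact value. The function names and the message format are assumptions.

```python
def inverse_condition_estimate(T):
    """Cheap proxy: min|T[m][m]| / max|T[m][m]| for triangular T."""
    diag = [abs(T[m][m]) for m in range(len(T))]
    largest = max(diag)
    return min(diag) / largest if largest > 0.0 else 0.0


def check_conditioning(T, threshold):
    """Return a diagnostic string if the estimate is below threshold, else None."""
    estimate = inverse_condition_estimate(T)
    if estimate < threshold:
        return ("diagnostic: inverse condition estimate %.3e is below "
                "threshold %.3e" % (estimate, threshold))
    return None
```

A production implementation would more likely obtain a condition estimate from the factorization routine itself.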

A.7.6 Factor Subarray Matrix Process Validation Criteria

The matrices T, X should satisfy the condition

TT* = XX*

where * denotes conjugate transpose.
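
The validation condition follows in one line if one assumes (the report does not state the exact form of QR_Factorization) that X is factored as a triangular matrix T times a matrix Q with orthonormal rows:

```latex
X = TQ, \qquad QQ^{*} = I
\quad\Longrightarrow\quad
XX^{*} = T\,Q\,Q^{*}\,T^{*} = TT^{*}.
```

Under this assumption the cross-spectral matrix is never formed explicitly; its inverse is applied through triangular solves with T.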

A.8.0 Distribute Replica Vectors Process

The Distribute Replica Vectors Process accesses replicas from mass storage
and routes them to appropriate PEs.

A.8.1 Distribute Replica Vectors Process Inputs


Raw_Replica_Vectors:

TBD

Parameters:

N_Freq_Bands
N_Freq_Bins_Per_Band
N_AzEl_Batches
N_AzEl_Beams_Per_Batch
N_Subarrays
N_Retained_Times
N_Replicas
N_Replicas_Per_Batch
N_Replica_Batches


A.8.2 Distribute Replica Vectors Process Input/Outputs
TBD

A.8.3 Distribute Replica Vectors Process Outputs


Replica_Vectors:

vi[i1, i2, k1, k2, r, s]   i1 = 0, ..., N_Freq_Bands - 1
                           i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                           k1 = 0, ..., N_AzEl_Batches - 1
                           k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                           r = 0, ..., N_Replicas_Per_Batch - 1
                           s = 0, ..., N_Subarrays - 1
vi complex


A.8.4 Distribute Replica Vectors Process Algorithm
TBD

A.8.5 Distribute Replica Vectors Process Special Requirements

TBD

A.8.6 Distribute Replica Vectors Process Validation Criteria

TBD

A.9.0 Compute Output Power and Narrowband Time Series Process

The Compute Output Power and Narrowband Time Series Process forms
and outputs either Bartlett or Minimum Energy power for a set of input
replica vectors.

A.9.1 Compute Output Power and Narrowband Time Series Process Inputs


Subarray_AzEl_NB_Time_Series:

yh[i1, i2, k1, k2, s, n]   i1 = 0, ..., N_Freq_Bands - 1
                           i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                           k1 = 0, ..., N_AzEl_Batches - 1
                           k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                           s = 0, ..., N_Subarrays - 1
                           n = 0, ..., N_Retained_Times - 1
yh complex

Data_Matrix_Factorization:

to[i1, i2, k1, k2, m, n]   i1 = 0, ..., N_Freq_Bands - 1
                           i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                           k1 = 0, ..., N_AzEl_Batches - 1
                           k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                           m = 0, ..., N_Subarrays - 1
                           n = 0, ..., N_Subarrays - 1
to complex

Replica_Vectors:

vi[i1, i2, k1, k2, r, s]   i1 = 0, ..., N_Freq_Bands - 1
                           i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                           k1 = 0, ..., N_AzEl_Batches - 1
                           k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                           r = 0, ..., N_Replicas_Per_Batch - 1
                           s = 0, ..., N_Subarrays - 1
vi complex

Parameters:

N_Freq_Bands
N_Freq_Bins_Per_Band
N_AzEl_Batches
N_AzEl_Beams_Per_Batch
N_Subarrays
N_Retained_Times
N_Replicas_Per_Batch
I_B_ME_Flag

Time Index:

I_Time


A.9.2 Compute Output Power and Narrowband Time Series Process Input/Outputs

A.9.3 Compute Output Power and Narrowband Time Series Process Outputs


Raw_Output_Power:

p[i1, i2, k1, k2, r]   i1 = 0, ..., N_Freq_Bands - 1
                       i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                       k1 = 0, ..., N_AzEl_Batches - 1
                       k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                       r = 0, ..., N_Replicas_Per_Batch - 1
p real

Raw_Narrowband_Time_Series:

TBD

A.9.4 Compute Output Power and Narrowband Time Series Process Algorithm


Define v to be a vector (length N_Subarrays) such that

v[s] = vi[i1, i2, k1, k2, r, s]   s = 0, ..., N_Subarrays - 1
(one such v for each value of (i1, i2, k1, k2, r))

Define X to be a matrix (N_Subarrays by N_Retained_Times) such that

X[m, n] = yh[i1, i2, k1, k2, m, n]   m = 0, ..., N_Subarrays - 1
                                     n = 0, ..., N_Retained_Times - 1
(one such X for each value of (i1, i2, k1, k2))

Define T to be an upper triangular matrix (N_Subarrays by N_Subarrays)
such that

T[m, n] = to[i1, i2, k1, k2, m, n]   m = 0, ..., N_Subarrays - 1
                                     n = 0, ..., N_Subarrays - 1
(one such T for each value of (i1, i2, k1, k2))

If I_Time < N_Retained_Times and I_B_ME_Flag = 1 then return

In the matrix computations below, a trailing * denotes conjugate transpose.

For each i1 in 0, ..., N_Freq_Bands - 1
  For each i2 in 0, ..., N_Freq_Bins_Per_Band - 1
    For each k1 in 0, ..., N_AzEl_Batches - 1
      For each k2 in 0, ..., N_AzEl_Beams_Per_Batch - 1
        For each r in 0, ..., N_Replicas_Per_Batch - 1
          Switch on I_B_ME_Flag
            Case 0:
              Matrix computation: w = X* v
              Matrix computation: p[i1, i2, k1, k2, r] = w* w
            End case 0
            Case 1:
              Matrix computation: w = Backsolve(T; v)
              Matrix computation: p[i1, i2, k1, k2, r] = (w* w)^-1
            End case 1
          End switch
        End for
      End for
    End for
  End for
End for
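
The two cases of the switch can be sketched in Python as follows. The sketch assumes that the scanned formulas read w = X* v with p = w* w for case 0 (Bartlett) and p = (w* w)^-1 with w solving T w = v for case 1 (Minimum Energy), where * is the conjugate transpose. The function names and the nested-list matrix layout are hypothetical.

```python
def bartlett_power(X, v):
    """Case 0: w = X* v, p = w* w, i.e. p = v* (X X*) v."""
    n_sub, n_times = len(X), len(X[0])
    w = [sum(X[m][n].conjugate() * v[m] for m in range(n_sub))
         for n in range(n_times)]
    return sum(wn.real ** 2 + wn.imag ** 2 for wn in w)


def backsolve(T, v):
    """Solve T w = v by back substitution, T upper triangular."""
    n = len(T)
    w = [0j] * n
    for m in range(n - 1, -1, -1):
        acc = v[m] - sum(T[m][c] * w[c] for c in range(m + 1, n))
        w[m] = acc / T[m][m]
    return w


def minimum_energy_power(T, v):
    """Case 1: w = Backsolve(T; v), p = 1 / (w* w)."""
    w = backsolve(T, v)
    return 1.0 / sum(wm.real ** 2 + wm.imag ** 2 for wm in w)
```

If T T* = X X*, then w* w in case 1 equals v* (X X*)^-1 v, so the Minimum Energy power is obtained without inverting the cross-spectral matrix.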


A.9.5 Compute Output Power and Narrowband Time Series Process Special Requirements

A.9.6 Compute Output Power and Narrowband Time Series Process Validation Criteria

A.10.0 Collect from PEs by Frequency Band Process

The Collect from PEs by Frequency Band Process routes output power
and narrowband time series for output to mass storage.

A.10.1 Collect from PEs by Frequency Band Process Inputs


Raw_Output_Power:

p[i1, i2, k1, k2, r]   i1 = 0, ..., N_Freq_Bands - 1
                       i2 = 0, ..., N_Freq_Bins_Per_Band - 1
                       k1 = 0, ..., N_AzEl_Batches - 1
                       k2 = 0, ..., N_AzEl_Beams_Per_Batch - 1
                       r = 0, ..., N_Replicas_Per_Batch - 1
p real

Raw_Narrowband_Time_Series:

TBD

Parameters:

N_Freq_Bands
N_Freq_Bins_Per_Band
N_AzEl_Batches
N_AzEl_Beams_Per_Batch
N_Replicas_Per_Batch


A.10.2 Collect from PEs by Frequency Band Process Input/Outputs
TBD

A.10.3 Collect from PEs by Frequency Band Process Outputs


Output_Power:

TBD

Narrowband_Time_Series:

TBD


A.10.4 Collect from PEs by Frequency Band Process Algorithm
TBD

A.10.5 Collect from PEs by Frequency Band Process Special Requirements
TBD

A.10.6 Collect from PEs by Frequency Band Process Validation Criteria
TBD